ABSTRACT

The paper considers topical theoretical and methodological problems in the development of fuzzy-neural models of decision making. It proposes a two-stage adaptive neuro-fuzzy algorithm for synthesizing fuzzy inference systems (FIS). In the first stage, the initial fuzzy parameters are clustered in order to reduce the number of input parameters of the fuzzy rules; in the second, Sugeno-type fuzzy models (inference rules) are synthesized.
INTRODUCTION

Data mining tasks, such as identifying hidden relationships in data, classifying data, and finding patterns that predict how the studied processes will develop, are in most cases solved under uncertainty, with very large volumes of heterogeneous information. Under these conditions, solutions are obtained through semi-structured decision making (SSDM) using intelligent technologies, including soft computing. Such tasks are solved most efficiently when the components of soft computing (fuzzy logic, neural networks, and evolutionary algorithms) are used together. One promising direction is the study of theoretical and applied questions in the synthesis of fuzzy-neural decision-making models based on a combination of fuzzy set theory, neural networks, and evolutionary algorithms used for training such models and optimizing their parameters.

Models and algorithms of fuzzy inference are central to decision making, control, forecasting, classification, pattern recognition, and machine learning under fuzzy uncertainty. They are implemented as fuzzy inference systems (FIS), whose core is an interconnected set of production rules of the form "If A, then B". These rules are formed from the linguistic statements of experts.

In solving practical problems under fuzzy uncertainty, the information needed to construct and implement decision support systems can be divided into two parts: numerical (quantitative) information and linguistic (qualitative) information coming from an expert. Most FIS use the second type of knowledge, usually presented as a fuzzy rule base. The rules reflect the structure of the fuzzy model of the problem as a whole and carry the basic knowledge (expert information) about the simulated system, i.e., the main component of the "intelligence" of the solution. The correct formation of the fuzzy rule base is therefore essential to solving the problem effectively. For the model to be adequate to the real situation, the number of generated FIS rules must usually equal the number of conditions A of a rule, that is, the number of elements in the input vector of the system. An excessively large number of rules increases the dimension of the model and, consequently, the complexity of the problem being solved. In addition, the available information about the simulated system, including expert information, is often insufficient to build a more complex and adequate model. The objective limits on the accuracy with which the source data can be obtained must also be taken into account. The rules are therefore formed and evaluated, in the course of building the models, according to the principle of reasonable completeness and accuracy. This makes it important to systematize and classify the initial information so as to reduce the number of FIS rules to a reasonable level. One of the most promising approaches to forming fuzzy rules and tuning their parameters, especially when only numerical data are available, is the use of fuzzy neural networks [2]. For all their merits, their main drawback is the time needed to construct the fuzzy rule base through iterative training of the neural network. It is therefore appropriate to investigate combining classification and clustering methods with neural network methods.

In constructing the classification and clustering procedures, the various methods and criteria were reviewed and classified [5-9].

The methods and algorithms for classification and clustering include the following: artificial neural networks, decision trees, symbolic rules, nearest-neighbor and k-nearest-neighbor methods, support vector machines, Bayesian networks, linear regression, correlation and regression analysis, hierarchical cluster analysis methods, non-hierarchical cluster analysis methods (including the k-means and k-median algorithms), methods for mining association rules (including the Apriori algorithm), limited exhaustive search, evolutionary programming and genetic algorithms, a variety of data visualization methods, and many others.

These methods fall into two groups:

•  Statistical methods, based on averaged accumulated experience as reflected in historical data;

•  Cybernetic methods, which include many diverse mathematical approaches.

Statistical data mining methods are classified into four groups:

•  Descriptive  analysis  and  a  description  of  the  source  data. 

•  Analysis  of  the  relations  (Correlation  and  regression  analysis,  factor  analysis,  analysis  of  variance). 

•  Multivariate  statistical  analysis  (Component  analysis,  discriminant  analysis,  multivariate  regression  analysis, 
canonical  correlation,  etc.). 

•  Time  series  analysis  (Forecasting  and  dynamic  models). 

The second group, cybernetic methods, comprises a variety of approaches united by the idea of computer mathematics and the use of artificial intelligence theory.

This group includes the following methods:

•  Artificial  neural  networks  (Recognition,  clustering,  prediction); 

•  Evolutionary programming (including the algorithms of the group method of data handling);

•  Genetic algorithms (optimization);

•  Associative memory (search for analogues, prototypes);

•  Fuzzy logic;

•  Decision trees;

•  Expert knowledge processing.

The methods aimed at obtaining descriptive results include iterative cluster analysis methods (among them the k-means and k-median algorithms), hierarchical cluster analysis methods, Kohonen self-organizing maps, cross-tabulation visualization methods, and various other visualization tools. Predictive methods use the values of some variables to predict (forecast) unknown (missing) or future values of other (target) variables.

The methods aimed at obtaining predictive results include neural networks, decision trees, linear regression, nearest-neighbor methods, support vector machines, and others.

Bayesian networks (belief networks) [5] are an approach to classification that combines the Bayesian approach with graph theory. Each vertex of the graph corresponds to a component of the feature vector, and the arcs represent causal relationships. The network can be constructed automatically by analyzing the correlations between the feature components. This approach does not require assumptions as strong as those of the maximum a posteriori probability principle; however, it lacks that principle's theoretical appeal: in the absence of a priori data, the constructed network is not guaranteed to deliver minimum total risk. Ill-conditioning is also a problem for Bayesian networks, since a feature vector of large dimension makes the graph of relationships very complex to design and analyze and greatly increases the computational complexity. One solution is to reduce the dimension of the feature vector, but this leads to a deterioration of the generalization capability.

The original support vector machine (SVM) method was proposed by Vapnik in 1963 [6] as a method for constructing an optimal linear classifier. Although the assumption that the classes are linearly separable is less severe than, for example, the assumptions of the maximum a posteriori probability principle, in practice it rarely holds. In 1992, a way was proposed to generalize the support vector method to a wide class of nonlinear problems [7]. The classical algorithm constructs a linear separating surface (hyperplane) equidistant from the convex hulls of the classes, which are built from the precedents (training examples). It is argued that this separating hyperplane is optimal in terms of overall risk relative to any other possible hyperplane. If no such hyperplane exists (the classes are not linearly separable), a nonlinear kernel transformation is applied that maps the original space into a space of higher, possibly infinite, dimension. A separating surface that is linear in the space induced by the kernel transformation is nonlinear in the original space, which partially solves the problem of nonlinear classification. Constructing the classifier reduces to a quadratic programming problem, which is solved by well-known optimization methods [8]. Notably, the solution of this quadratic programming problem is unique and is a global extremum.
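
As an illustration, the quadratic programming problem mentioned here can be written, for the linearly separable case, in the standard hard-margin form. The symbols $w$ (normal vector of the hyperplane), $b$ (offset), and the class labels $t_q \in \{-1, +1\}$ are notation assumed for this sketch and are not taken from the paper:

$$
\min_{w,\,b}\ \frac{1}{2}\lVert w\rVert^{2}\quad\text{subject to}\quad t_q\,(w\cdot X_q + b)\ \ge\ 1,\qquad q = 1,\ldots,N.
$$

The objective is convex and the constraints are linear, which is why a solution found by a quadratic programming solver is a global rather than a local extremum.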

The advantages of the k-nearest neighbors algorithm are:

•  Training reduces to memorizing the training set.

•  Ease of implementation and the ability to introduce additional settings.

•  The precedent-based logic of the algorithm is well understood by experts in subject areas (medicine, biometrics, law).

The disadvantages of this algorithm are:

•  The need to store the entire training set, which leads to inefficient memory usage.

•  A large number of operations to classify each pattern. As a consequence, working with large samples takes much longer than with traditional neural networks.
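
The trade-off listed above can be seen in a minimal sketch of a k-nearest-neighbors classifier. The function name `knn_predict`, the Euclidean distance, the majority vote, and the toy data are assumptions made for illustration; the paper does not specify them.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # "Training" is just keeping train_x / train_y in memory.
    # Every query scans the whole training set, which is the cost noted above.
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy data (made up for illustration): two well-separated groups.
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["a", "a", "a", "b", "b", "b"]
label_near_origin = knn_predict(train_x, train_y, (0.2, 0.1))
label_far = knn_predict(train_x, train_y, (5.5, 5.5))
```

There is no fitting step at all, which is the first advantage in the list; the sort over all stored points on every call is the second disadvantage.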

A radial basis function (RBF) network has an intermediate layer of radial elements, each of which produces a Gaussian response surface. Since these functions are nonlinear, no more than one intermediate layer is needed to model an arbitrary function; it is only necessary to take a sufficient number of radial elements. It remains to decide how to combine the outputs of the hidden radial elements to obtain the network output. It turns out that a linear combination (i.e., a weighted sum of the Gaussian functions) is enough, so an RBF network has an output layer of elements with linear activation functions (Haykin, 1994; Bishop, 1995). The corresponding training algorithms are faster than those for the multilayer perceptron (MLP), although they are less suited to finding suboptimal solutions. A model based on RBF will also run slower and require more memory than the corresponding MLP (but it trains much faster, which in some cases is more important).
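
The structure just described, a hidden layer of Gaussian elements followed by a linear output, can be sketched as follows. The parameter names (`centers`, `widths`, `weights`, `bias`) and the numeric values are illustrative assumptions, not values from the paper.

```python
import math

def rbf_network_output(x, centers, widths, weights, bias=0.0):
    """Output of a one-hidden-layer RBF network with a single linear output."""
    # Hidden layer: one Gaussian response per radial element.
    hidden = [math.exp(-math.dist(x, c) ** 2 / (2.0 * s ** 2))
              for c, s in zip(centers, widths)]
    # Output layer: linear activation, i.e. a weighted sum of the Gaussians.
    return bias + sum(w * h for w, h in zip(weights, hidden))

# Two radial elements placed far apart (made-up values).
centers = [(0.0, 0.0), (10.0, 10.0)]
widths = [1.0, 1.0]
weights = [2.0, -3.0]
y_at_first_center = rbf_network_output((0.0, 0.0), centers, widths, weights)
y_at_second_center = rbf_network_output((10.0, 10.0), centers, widths, weights)
```

At each center the response of the other element is negligible, so the output is essentially that element's weight.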

Based on the analysis of existing clustering methods, a combined method is proposed for constructing a fuzzy rule base using clustering based on fuzzy relations together with fuzzy neural networks.

The advantage of this method lies in its simplicity and high efficiency. In addition, it makes it possible to combine numerical information, given as training data, with linguistic information, given as a rule base, by supplementing the existing rule base with rules created from the numerical data.

The algorithm for synthesizing the FIS rules and tuning their parameters is implemented in two stages.

In the first stage, the initial input variables of the rules, specified and formed during the analysis of the situation under study, are clustered. Each resulting cluster comprises a group of original input variables that are similar in some sense, and is represented as a generalized condition of the corresponding FIS rules. This makes it possible to reduce significantly the number of generated FIS rules, which corresponds to the number of clusters formed from the input variables. At this stage the rules receive preliminary, rough values of the parameters that describe the mathematical model of the membership functions. In the second stage, these parameters are refined and tuned using fuzzy neural networks and various training procedures.

To construct the fuzzy rule base using clustering based on fuzzy relations and fuzzy neural networks, each data set was randomly divided into 10 disjoint subsets. One subset was used as the test set and the remaining nine as the training set. The experimental results show that in most cases the proposed approach classifies more accurately than the other methods considered.
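
The splitting scheme described above might be sketched as follows. The function name `k_fold_splits`, the fixed seed, and the round-robin assignment of shuffled indices to folds are assumptions for illustration; the paper does not state how the random division was implemented.

```python
import random

def k_fold_splits(n_samples, k=10, seed=0):
    """Return k (train_indices, test_indices) pairs over range(n_samples)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # random division of the data set
    folds = [indices[i::k] for i in range(k)]   # k disjoint subsets
    splits = []
    for i, test in enumerate(folds):
        # One subset is the test set, the remaining k - 1 form the training set.
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train, test))
    return splits

splits = k_fold_splits(150)   # e.g. a data set with 150 objects
```

Each object appears in exactly one test subset, so averaging the accuracy over the ten splits uses every object for testing once.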

Formulation  of  the  Problem  of  Synthesis  of  the  Rules  of  the  FIS 

The fuzzy inference problem is described by a fuzzy Sugeno model [2, 10]:

$$
\bigcup_{p=1}^{k_j}\left(\bigcap_{i=1}^{n} x_i = a_{i,jp}\ \text{with weight}\ w_{jp}\right)\ \rightarrow\ y_j = f_j(x_1, x_2, \ldots, x_n),\qquad j = 1, \ldots, m. \tag{1}
$$

Here $x_i$, $y_j$ are the input and output variables; $j = 1, \ldots, m$ is the number of the rule; $a_{i,jp}$ is the linguistic term by which the input variable $x_i$ is evaluated in the row (conjunction) with number $jp$ ($p = 1, \ldots, k_j$) of the $j$-th rule; $w_{jp} \in [0, 1]$ is the weight of the rule with number $jp$. The conclusion of the fuzzy rules $y_j = f_j(x_1, x_2, \ldots, x_n)$ can be described, for example, by a polynomial of the form

$$
y_j = b_{j0} + b_{j1} x_1 + b_{j2} x_2 + \ldots + b_{jn} x_n,\qquad j = 1, \ldots, m.
$$

In model (1) the input variables are evaluated by fuzzy terms $a_{i,jp}$ (for example, quantifiers such as VL, very low; L, low; BA, below average; A, average; AA, above average; H, high; VH, very high), each described by its own membership function (MF). In general, the MF is described by the expression

$$
\mu^{j}(x_i) = \frac{1}{1 + \left(\dfrac{x_i - c_i^{j}}{\sigma_i^{j}}\right)^{2}}. \tag{2}
$$

Here $c_i^{j}$, $\sigma_i^{j}$ are the parameters of the MF, $j$ is the number of the rule, and $i$ is the index of the term.
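
A minimal sketch of the membership function, assuming expression (2) has the bell-shaped form $1 / (1 + ((x - c)/\sigma)^2)$ with $c$ the position of the maximum and $\sigma$ the width. The function name `bell_mf` and the sample values are illustrative only.

```python
def bell_mf(x, c, sigma):
    """Membership degree of x for a term with centre c and width sigma."""
    return 1.0 / (1.0 + ((x - c) / sigma) ** 2)

peak = bell_mf(1.0, c=1.0, sigma=2.0)   # x at the centre -> 1.0
half = bell_mf(3.0, c=1.0, sigma=2.0)   # x one width away from the centre -> 0.5
```

Under this form, $c$ shifts the term along the axis and $\sigma$ is the distance from the centre at which the membership degree drops to one half.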
For the specific problem under consideration, an initial set of input variables $X^0$ is formed, together with a training set of "input-output" pairs with fixed (measured) values of the inputs $X_q^{*} = (x_{1q}^{*}, \ldots, x_{nq}^{*})$ and the corresponding output parameters $Y^{*} = \{y_q^{*}\}$, where $q = 1, \ldots, N$ and $N$ is the number of samples.

First, the set $X^0$ must be reduced to a set $X$ of smaller dimension, and the rules corresponding to its elements must be formed with preliminary values of the MF parameters. Then, by tuning, values of the parameters $c_i^{j}$, $\sigma_i^{j}$ must be found for which the deviation of the current values of the conclusions of the rules (1) from the reference values in the training set is minimal. To solve this synthesis problem, adaptive algorithms have been developed for clustering the input variables of model (1) and tuning the MFs (2); they are described below.
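
To make the model concrete, the following sketch evaluates a small Sugeno-type rule base. Several choices here are assumptions rather than statements of the paper: bell-shaped membership functions, min for the conjunction inside a rule row, max across the rows of a rule scaled by the row weight, and a weighted average of the rule conclusions as the final output. All names and numbers are hypothetical.

```python
def bell_mf(x, c, sigma):
    return 1.0 / (1.0 + ((x - c) / sigma) ** 2)

def rule_strength(x, rows):
    # rows: list of (weight, [(c, sigma) for each input]).
    # AND inside a row is min, OR across rows is max (assumed operators).
    return max(w * min(bell_mf(xi, c, s) for xi, (c, s) in zip(x, terms))
               for w, terms in rows)

def sugeno_output(x, rules):
    # rules: list of (rows, b) with conclusion y_j = b[0] + b[1]*x_1 + ... + b[n]*x_n.
    strengths = [rule_strength(x, rows) for rows, _ in rules]
    outputs = [b[0] + sum(bi * xi for bi, xi in zip(b[1:], x)) for _, b in rules]
    # Weighted average of the rule conclusions (a common Sugeno choice).
    return sum(s * y for s, y in zip(strengths, outputs)) / sum(strengths)

rules = [
    ([(1.0, [(0.0, 1.0)])], [0.0, 0.0]),    # "x is near 0" -> y = 0
    ([(1.0, [(2.0, 1.0)])], [10.0, 0.0]),   # "x is near 2" -> y = 10
]
y_mid = sugeno_output([1.0], rules)    # both rules fire equally
y_left = sugeno_output([0.0], rules)   # first rule dominates
```

Halfway between the two term centres both rules fire with strength 0.5, so the output is the plain average of the two conclusions. Tuning $c$ and $\sigma$ changes these strengths and hence the deviation from the reference outputs.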

The Algorithm for Clustering the Input Parameters of the Fuzzy Rules of the FIS

The main purpose of this algorithm, which implements the first stage, is to reduce the initial number of input parameters of the rules (1) and, accordingly, the number of rules, by means of clustering procedures, and to form FIS rules with preliminary (coarse) values of the parameters describing their MFs (2). The well-known clustering algorithms K-Means and Expectation Maximization [5] restrict the geometry of the resulting clusters: in particular, each cluster must admit a convex covering. This restriction follows from the assumptions these algorithms make, namely the existence of cluster centers (K-Means) or of a probability density function for each cluster with corresponding values of the expectation and variance (Expectation Maximization). These algorithms therefore cannot adequately partition non-convex sets into clusters, still less sets with nested substructures. This problem is solved by the algorithm described below, which clusters a finite set of elements of an arbitrary metric space by partitioning the original set into equivalence classes of a fuzzy relation. It groups into one cluster those elements between which there is a sequence of elements that are "close" to each other, which also corresponds to the intuitive idea of grouping.

Consider the clustering algorithm in the following formulation. Suppose a rule base is to be built for an FIS with n inputs and one output. To do this, a training set of data tuples must first be created:


$$
(x_{1q}, \ldots, x_{nq}, y_q),\qquad q = 1, \ldots, N, \tag{3}
$$

where $X_q = (x_{1q}, \ldots, x_{nq})$ are the given values of the input parameters (conditions) of the FIS module, and $y_q$ is the expected (reference) value of the output parameter (conclusion) for the $q$-th training sample.

The essence of the clustering procedure is as follows. Let $X$ be a metric space, $d: X \times X \to \mathbb{R}$ a metric on it, and $(X_1, X_2, \ldots, X_N) \subset X$ a sequence of elements of $X$.

Assume that

$$
\forall i \in \{1, \ldots, N\}\ \ \exists j \in \{1, \ldots, N\}:\ X_i \ne X_j. \tag{4}
$$

It follows from condition (4) that for every $i \in \{1, \ldots, N\}$

$$
\max\{d(X_i, X_k) \mid k \in \{1, \ldots, N\}\} > 0. \tag{5}
$$

Thus, for each index $i$ we can define a function that describes the similarity of the $j$-th element to the $i$-th element:

$$
\xi_i: \{1, \ldots, N\} \to [0, 1],\qquad \xi_i(j) := 1 - \frac{d(X_i, X_j)}{\max\{d(X_i, X_k) \mid k \in \{1, \ldots, N\}\}}.
$$

For each index $i$, we define a function that describes the similarity of the $k$-th and $l$-th elements with respect to the $i$-th element:

$$
g_i: \{1, \ldots, N\}^2 \to [0, 1],\qquad g_i(k, l) := 1 - |\xi_i(k) - \xi_i(l)|.
$$

We now define a function $\mu(i, j)$ that describes the similarity of any two elements of the sequence with respect to all elements of the sequence:

$$
\mu: \{1, \ldots, N\}^2 \to [0, 1],\qquad \mu(i, j) := \min\{g_k(i, j) \mid k \in \{1, \ldots, N\}\}
$$

for all $i$ and $j$.

For $k = 1, 2, \ldots, N$ we recursively define the functions $\mu^{(k)}: \{1, \ldots, N\}^2 \to [0, 1]$:

$$
\mu^{(1)}(i, j) := \mu(i, j),
$$

$$
\mu^{(k)}(i, j) := \max\{\min\{\mu^{(k-1)}(i, l),\ \mu^{(k-1)}(l, j)\} \mid l \in \{1, \ldots, N\}\},
$$

so that

$$
\mu^{(k)}(i, i) = 1\ \ \forall i, k,\qquad \mu^{(k)}(i, j) \ge \mu^{(k-1)}(i, j)\ \ \forall k \ge 2.
$$

For each level $\alpha \in [0, 1]$, we define a binary relation $R_\alpha \subset \{X_1, X_2, \ldots, X_N\}^2$ on the set $\{X_1, X_2, \ldots, X_N\}$ as follows:

$$
(X_i, X_j) \in R_\alpha\ :\Leftrightarrow\ \mu^{(N)}(i, j) \ge \alpha.
$$

$R_\alpha$ is an equivalence relation, which divides the set $\{X_1, X_2, \ldots, X_N\}$ into disjoint equivalence classes. Two elements $X_i$, $X_j$ are in the same equivalence class if and only if the similarity (proximity) measure $\mu^{(N)}(i, j)$ for these elements is large enough:

$$
\mu^{(N)}(i, j) = \max_{j_1, \ldots, j_r} \min\{\mu^{(N)}(i, j_1),\ \mu^{(N)}(j_1, j_2),\ \ldots,\ \mu^{(N)}(j_r, j)\}.
$$
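
The construction above can be sketched in a few functions. This is an illustrative reading of the definitions, assuming the similarity measure is $\mu(i,j) = \min_k \left(1 - |\xi_k(i) - \xi_k(j)|\right)$ with $\xi_i(j) = 1 - d(X_i, X_j) / \max_k d(X_i, X_k)$, that the closure is obtained by repeated max-min composition, and that elements are grouped when the closure is at least $\alpha$. The function names and the one-dimensional toy data are hypothetical.

```python
def similarity_matrix(points, dist):
    n = len(points)
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    # xi[i][j] = 1 - d(X_i, X_j) / max_k d(X_i, X_k); needs at least two distinct points.
    xi = [[1.0 - d[i][j] / max(d[i]) for j in range(n)] for i in range(n)]
    # mu[i][j] = min over k of 1 - |xi_k(i) - xi_k(j)|.
    return [[min(1.0 - abs(xi[k][i] - xi[k][j]) for k in range(n))
             for j in range(n)] for i in range(n)]

def max_min_closure(mu):
    n = len(mu)
    for _ in range(n - 1):   # mu^(2), ..., mu^(N)
        mu = [[max(min(mu[i][l], mu[l][j]) for l in range(n))
               for j in range(n)] for i in range(n)]
    return mu

def alpha_clusters(points, dist, alpha):
    """Label each point with the index of its equivalence class at level alpha."""
    closure = max_min_closure(similarity_matrix(points, dist))
    labels = [-1] * len(points)
    next_label = 0
    for i in range(len(points)):
        if labels[i] == -1:
            for j in range(len(points)):
                if closure[i][j] >= alpha:
                    labels[j] = next_label
            next_label += 1
    return labels

points = [0.0, 0.1, 0.2, 5.0, 5.1]                      # toy 1-D data
labels = alpha_clusters(points, lambda a, b: abs(a - b), alpha=0.9)
```

Raising $\alpha$ splits the set into more, tighter clusters; lowering it merges them, down to a single cluster at $\alpha = 0$. No cluster centers or cluster count are supplied.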

Using the proposed similarity measures, fuzzy rules are formed such that the FIS built on them generates, for given values of the input parameters, output parameters (conclusions) with the smallest deviation of their current values from the reference values given in the training set (3).

The clustering algorithm is implemented in the following sequence.

Step 1: Division of the set $\{X_1, X_2, \ldots, X_N\}$ into disjoint equivalence classes. Suppose the minimum and maximum values of each input and output signal are known. They determine the intervals in which the admissible values lie. For the input signal $x_i$ this interval is denoted $[x_i^{-}, x_i^{+}]$. If the values $x_i^{-}$ and $x_i^{+}$ are unknown, the training data can be used to choose the minimum and maximum values accordingly.

Each interval defined in this way is divided into $K$ regions (segments); the value of $K$ is selected individually for each signal, and the segments may have the same or different lengths.
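
As a small illustration of this step, the sketch below takes the interval bounds from the training data and divides the interval into segments of equal length. Equal lengths are only one of the options the text allows, and the function name `partition_interval` and the sample values are assumptions.

```python
def partition_interval(values, k):
    """Split [min(values), max(values)] into k segments of equal length."""
    lo, hi = min(values), max(values)   # bounds taken from the training data
    step = (hi - lo) / k
    return [(lo + i * step, lo + (i + 1) * step) for i in range(k)]

# Observed values of one input signal (made up), split into K = 7 segments.
segments = partition_interval([0.0, 3.5, 7.0, 1.2], k=7)
```

With $K = 7$ each segment can be associated with one term of a seven-level linguistic scale.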

To estimate the values of the linguistic variables, we use the seven-level scale of quantifier terms introduced above. Each of these terms is a fuzzy set given by an appropriate membership function.

Using the introduced qualitative terms (classifiers) and expert knowledge, the fuzzy rules are represented as a table whose elements are the terms of the fuzzy rules. Using the table and the operations $\wedge$ (AND, min) and $\vee$ (OR, max), it is easy to write the system of fuzzy logic equations that relate the membership functions of the conclusions of the fuzzy inference to the input variables.

In general, each variable of the input vector $X_q = (x_{1q}, \ldots, x_{nq})$, $q = 1, \ldots, N$, has its own membership functions for the seven fuzzy terms (from very low to very high) used in the FIS rules. To simplify the model, we use only one form of membership function for all variables of the input vector.

Step 2: Construction of fuzzy rules from the training data.

First we determine the degree of membership of the training data (3) in each region selected in Step 1. These degrees are expressed by the values of the MFs of the corresponding fuzzy sets.
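
One way to picture this step is sketched below: each value of a training sample is assigned the term in which its membership degree is highest. The English term labels, the evenly spaced term centres, the width equal to half the spacing, and the common range $[0, 6]$ for all signals are assumptions made for the example, as are the function names.

```python
TERMS = ["VL", "L", "BA", "A", "AA", "H", "VH"]   # assumed labels: very low .. very high

def bell_mf(x, c, sigma):
    return 1.0 / (1.0 + ((x - c) / sigma) ** 2)

def term_degrees(x, lo, hi):
    """Membership degree of x in each of the seven terms on [lo, hi]."""
    step = (hi - lo) / (len(TERMS) - 1)
    # Assumed layout: centres evenly spaced, width equal to half the spacing.
    return {t: bell_mf(x, lo + i * step, step / 2.0) for i, t in enumerate(TERMS)}

def best_term(x, lo, hi):
    degrees = term_degrees(x, lo, hi)
    return max(degrees, key=degrees.get)

# One training sample (x_1, x_2, y), each signal ranging over [0, 6] (made up).
sample = (0.3, 3.2, 5.9)
rule_terms = tuple(best_term(v, 0.0, 6.0) for v in sample)
```

The resulting tuple reads as a candidate rule, "if $x_1$ is VL and $x_2$ is A, then $y$ is VH", whose parameters are only rough at this stage.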

The main advantages of the algorithm considered are:

•  No a priori assumptions about the structure of the source data (the type and parameters of the probability distribution of the clusters, the density centers, the number of clusters).

•  A clear interpretation of the resulting partition into clusters: elements are in the same cluster when there is a sequence of elements between them that are close to each other.

•  No restrictions on the geometry of the clusters. The clusters obtained with this algorithm can be sets of any geometric shape, including non-convex ones. This distinguishes the algorithm from known clustering algorithms (such as modifications of K-Means and Expectation Maximization).

Algorithm for Identifying and Tuning the Parameters of the Fuzzy Rules of the FIS

To identify the parameters of the conclusions of the rules (1), the following neuro-fuzzy algorithm is proposed:

1. The values of the input and output parameters of the object state are fixed:

$$
X_q^{*} = (x_{1q}^{*}, \ldots, x_{nq}^{*}),\qquad Y^{*} = \{y_q^{*}\},\qquad q = 1, \ldots, N.
$$

2. The values of the membership functions of the input parameters $\mu^{q}(x_i^{*})$ are determined for the fixed values of the vector $X_q^{*} = (x_{1q}^{*}, \ldots, x_{nq}^{*})$.

3. The values of the membership functions of the output parameters $\mu^{y_q}(x_{1q}^{*}, x_{2q}^{*}, \ldots, x_{nq}^{*})$ are calculated for the fixed values of the vector $X_q^{*} = (x_{1q}^{*}, \ldots, x_{nq}^{*})$.

4. By training the neural network (NN), values of the parameters $c_i^{j}$, $\sigma_i^{j}$ of the membership functions (2) are chosen that minimize the residual $E_t = y_q^{*} - y_q$, i.e., the difference between the fixed real values of the output parameters of the object ($y_q^{*}$) and the values of the output parameters ($y_q$) formed at the output of the fuzzy NN that approximates the rules (1). As a result, the value of the output of the fuzzy NN is determined for which

$$
\mu^{y}(x_1, x_2, \ldots, x_n) = \max_{q = 1, \ldots, n}\left[\mu^{y_q}(x_1, x_2, \ldots, x_n)\right].
$$
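To train the neuro-fuzzy network, a system of recurrence relations is used that is a modification of the error back-propagation algorithm. For the $(t+1)$-th training iteration they have the form

$$
c_i^{j}(t+1) = c_i^{j}(t) - \eta\,(y_t - y_t^{*})\,\frac{\partial y_t}{\partial \mu^{j}(x_i)}\,\frac{2\,(\sigma_i^{j})^{2}\,(x_i - c_i^{j})}{\left((\sigma_i^{j})^{2} + (x_i - c_i^{j})^{2}\right)^{2}},
$$

$$
\sigma_i^{j}(t+1) = \sigma_i^{j}(t) - \eta\,(y_t - y_t^{*})\,\frac{\partial y_t}{\partial \mu^{j}(x_i)}\,\frac{2\,\sigma_i^{j}\,(x_i - c_i^{j})^{2}}{\left((\sigma_i^{j})^{2} + (x_i - c_i^{j})^{2}\right)^{2}},
$$

where $\eta$ is the learning rate, $y_t^{*}$ and $y_t$ are the reference output and the network output at iteration $t$, and the last factors are the derivatives of the MF (2) with respect to $c_i^{j}$ and $\sigma_i^{j}$; all parameters on the right-hand side are taken at iteration $t$.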


The learning algorithm of the neuro-fuzzy network consists of two phases. In the first phase, the model output value of the object ($y$) corresponding to the given network architecture is calculated. In the second phase, the residual value ($E_t$) is calculated and the parameters of the membership functions are recalculated.
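
The two phases can be illustrated on a deliberately reduced model. In the sketch below the "network" is a single bell-shaped membership function whose output is compared directly with reference values, so the gradient step only involves the derivatives of that function with respect to $c$ and $\sigma$. This simplification, the learning rate, the number of epochs, and the synthetic data are all assumptions; the paper's network also propagates the error through the rule layer.

```python
def bell_mf(x, c, sigma):
    # Same function as 1 / (1 + ((x - c) / sigma) ** 2), written with sigma ** 2.
    return sigma ** 2 / (sigma ** 2 + (x - c) ** 2)

def train(samples, c, sigma, eta=0.1, epochs=300):
    for _ in range(epochs):
        for x, y_ref in samples:
            # Phase 1: forward pass, compute the model output.
            y = bell_mf(x, c, sigma)
            # Phase 2: residual, then a gradient step on both parameters.
            err = y - y_ref
            denom = (sigma ** 2 + (x - c) ** 2) ** 2
            d_c = 2.0 * sigma ** 2 * (x - c) / denom        # d(mu)/d(c)
            d_sigma = 2.0 * sigma * (x - c) ** 2 / denom    # d(mu)/d(sigma)
            c, sigma = c - eta * err * d_c, sigma - eta * err * d_sigma
    return c, sigma

def squared_error(samples, c, sigma):
    return sum((bell_mf(x, c, sigma) - y) ** 2 for x, y in samples)

# Synthetic reference data generated with c = 2.0, sigma = 1.0.
samples = [(0.5 * i, bell_mf(0.5 * i, 2.0, 1.0)) for i in range(9)]
c0, sigma0 = 1.5, 1.5                       # rough values, as after the first stage
c1, sigma1 = train(samples, c0, sigma0)
```

Starting from rough parameter values, the updates reduce the squared residual over the training samples, which is the role the second stage plays for the preliminary rules.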
Computing  Experiment 

A computational experiment to evaluate the effectiveness of the proposed classification method was carried out on databases from the UCI Machine Learning Repository. The repository is a collection of real and model machine learning problems used by the scientific community for the empirical analysis of machine learning algorithms. It contains real data on applied problems in biology, medicine, physics, engineering, sociology, and other fields. The archive was established in 1987 by David Aha and fellow graduate students at UC Irvine (School of Information & Computer Science, University of California, Irvine, USA, http://www.ics.uci.edu). Since that time, it has been widely used by students, faculty, and researchers around the world as a major source of machine learning data sets.
Six real databases of different types were chosen for the experiment (Table 1).


Table 1

Database    Number of Classes    Number of Features    Number of Objects
Glass       7                    9                     214
Haberman    2                    4                     306
Iris        3                    4                     150
Ecoli       8                    7                     336
Wine        3                    13                    178
Liver       2                    6                     345

Table  2  shows  the  comparative  results  of  the  classification  accuracy  of  the  proposed  method  and  the  methods 
given  in  [14]. 
Table 2

Methods of Classification (classification accuracy, %)

Database    Proposed Method    SVM      1NN      KNN      Conventional RBF Network
Glass       87.85              71.50    72.01    72.01    69.16
Iris        98.3               97.33    96.00    95.33    95.33
Wine        98.88              99.44    95.52    96.07    98.89

Table 3 compares the classification accuracy on UCI databases of the proposed method and of the SVM method [16]. The fuzzy rule base was constructed, using clustering based on fuzzy relations and fuzzy neural networks, in several variants of the calculation. The variant that gives the lowest percentage of errors is taken as the good result and, conversely, the variant that gives the highest percentage of errors is taken as the bad result.


Table 3

Haberman    82.7    87.5    85.1    72.3    82.1    78.8
Liver       78.4    86      82.3    60.4    68.3    65.5
Ecoli       88.5    94.2    91.8    89.4    94.4    92.3

Table  4  shows  the  comparative  results  of  the  classification  accuracy  of  the  proposed  method  and  the  CBA 
(Classification  based  on  associations)  [15]. 

Table 4

Database    Proposed Method    CBA
Glass       87.85              65.28
Iris        98.3               94.00
Wine        98.88              86.67

Experiments were performed 10 times for each data set using 10-fold cross-validation.
CONCLUSIONS 

The proposed adaptive algorithm simplifies the synthesis of the fuzzy rules of an FIS by substantially reducing the dimension of the original variables, and makes it possible to adjust the fuzzy models promptly as the parameters of the environment change. Computational experiments on classifying the elements of the test databases showed that, in most cases, the proposed two-stage adaptive algorithm achieves higher classification accuracy than the known methods. The proposed approach was also tested on forecasting problems with real data; the results showed the high efficiency of the fuzzy prediction models synthesized by the proposed algorithm [11-13].

A promising line of further research is the development of methods and algorithms for synthesizing FIS rules using a combination of soft computing technologies: fuzzy sets, neural networks, and genetic algorithms.